A hexagonally connected processor array for Jacobi-type matrix algorithms∗

Authors

  • Marc Moonen
  • Joos Vandewalle
Abstract

Indexing terms: Parallel algorithms, systolic arrays, matrix computation

Jacobi-type matrix algorithms are mostly implemented on orthogonally connected processor arrays. In this letter, an alternative partitioning is described, resulting in a grid of hexagonally connected processors. This partitioning is shown to be over four times more efficient, as compared to the original configuration.

Introduction

Linear algebra and matrix computation play a central role in modern digital signal processing. Representative areas in which matrix computation is important include beamforming, direction finding, spectrum analysis, image processing, etc. For applications with substantial computational requirements, e.g. when a real time implementation is aimed at, parallel algorithms and architectures are indispensable. Systolic or wavefront arrays then provide a means of exploiting large amounts of parallelism and hence achieving orders of magnitude improvement in performance, consistent with the requirements [1].

Practical parallel algorithms for most of the key matrix operations are by now available. As for the matrix decomposition problems, most often methods are selected that are based on locally computed plane transformations. All of these are inspired by Jacobi’s algorithm for the symmetric eigenvalue decomposition (SEVD). Parallel Jacobi-type algorithms are available for the Schur decomposition [2], the singular value decomposition (SVD) [3], generalized SVD’s, the QR decomposition, etc.

In the next section, we first briefly review the conventional systolic implementation for Jacobi-type algorithms. The processor utilization is seen to be particularly low, as each processor is active only one fourth of the time. Next, we show how a different partitioning results in a completely different array, which is over four times more efficient.

∗This work was sponsored (partly) by the BRA 3280 project of the EC.

Systolic arrays for Jacobi-type algorithms
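The "locally computed plane transformations" mentioned in the introduction can be illustrated with a minimal serial sketch of the classical cyclic Jacobi iteration for the symmetric eigenvalue decomposition. This is not the systolic or hexagonal array of the letter: the function name `jacobi_sevd`, the row-by-row sweep order, and the parameters `sweeps` and `tol` are assumptions chosen for illustration only.

```python
import math


def jacobi_sevd(a, sweeps=10, tol=1e-12):
    """Serial cyclic Jacobi sketch for a real symmetric matrix (list of lists).

    Returns the approximate eigenvalues in ascending order.
    """
    n = len(a)
    a = [row[:] for row in a]  # work on a copy, leave the input untouched
    for _ in range(sweeps):
        # Stop once the off-diagonal part is negligible.
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < tol:
                    continue
                # Angle in [-pi/4, pi/4] with tan(2*theta) = 2*a_pq / (a_qq - a_pp),
                # so that the rotated matrix has a zero in position (p, q).
                d = a[q][q] - a[p][p]
                sign = 1.0 if d >= 0 else -1.0
                theta = 0.5 * math.atan2(sign * 2.0 * a[p][q], abs(d))
                c, s = math.cos(theta), math.sin(theta)
                # Column update A <- A J, where J is a rotation in the (p, q) plane.
                for k in range(n):
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                # Row update A <- J^T A.
                for k in range(n):
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))
```

Each rotation touches only rows and columns p and q, which is the locality that parallel Jacobi-type arrays exploit; how the rotations are scheduled across processors is the subject of the letter and is not modeled here.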


Related resources

A Processor-Time-Minimal Systolic Array for Cubical Mesh Algorithms

Using a directed acyclic graph (dag) model of algorithms, the paper focuses on time-minimal multiprocessor schedules that use as few processors as possible. Such a processor-time-minimal scheduling of an algorithm’s dag first is illustrated using a triangular shaped 2D directed mesh (representing, for example, an algorithm for solving a triangular system of linear equations). Then, algorithms r...


Design and Implementation of Field Programmable Gate Array Based Baseband Processor for Passive Radio Frequency Identification Tag (TECHNICAL NOTE)

In this paper, an Ultra High Frequency (UHF) base band processor for a passive tag is presented. It proposes a Radio Frequency Identification (RFID) tag digital base band architecture which is compatible with the EPC C C2/ISO18000-6B protocol. Several design approaches such as clock gating technique, clock strobe design and clock management are used. In order to reduce the area Decimal Matrix C...


Highly Parallel Hardware-oriented Algorithm for Jacobi SVD of Hermitian Quaternion Valued Matrix

In this study, a new highly parallel algorithm for the two-sided Jacobi 8-D transformation is suggested. It is oriented toward VLSI implementation of a special processor array. This array is built using an 8-D CORDIC algorithm for quaternion valued matrix singular value decomposition. Accuracy analysis and simulation results are added. Such an array can be utilized to speed up the Jacobi method realization to com...


Low Power Algorithms for Signal Processing

In this article we present techniques to reduce power consumption in parallel processor arrays. The reduction is substantially reached by avoiding non-essential operations, i.e. operations that hardly contribute to the convergence of the considered algorithm. The presented approach is clarified on the basis of a parallel implementation of a Jacobi-like real eigenvalue decomposition (EVD) but s...


A Systolic VLSI Architecture for Complex SVD

This thesis presents a systolic algorithm for the SVD of arbitrary complex matrices, based on the cyclic Jacobi method with "parallel ordering". As a basic step in the algorithm, a two-step, two-sided unitary transformation scheme is employed to diagonalize a complex 2 × 2 matrix. The transformations are tailored to the use of CORDIC (COordinate Rotation Digital Computer) algorithms for high spee...



Publication date: 1990